Updating the Rollover page #2589

Open
wants to merge 9 commits into
base: main

Contributor

@yetanothertw yetanothertw commented Aug 15, 2025

As per feedback received and collected in #1563 about the Rollover page, we're updating this page to improve the information quality and relevance.

The content has been restructured into the following sections:

  • The Rollover page includes a general description/introduction of the concept above the fold.
  • The How rollover works in ILM section ties the concept of rollover into the context of ILM. Some older admonitions that existed on this page have been restructured and integrated into logical compartments.
  • The Recommended approaches section aims to explain (at a very high level) how a user's approach to index rotation might differ depending on their use case.
  • Rotating your indices with data streams introduces the recommended approach and gives an overview of some of the requirements. It also links to the Data streams documentation, where naming patterns/generation are explained in more detail. It also now logically ties this section to the child page (the tutorial) that offers step-by-step guidance.
  • Rotating your indices with aliases introduces the other use case for the rollover action and links it to its respective tutorial steps.

Fixes #1563

As per internal feedback, we're updating this page.


github-actions bot commented Aug 15, 2025

🔍 Preview links for changed docs

Member

@dakrone dakrone left a comment

Thanks for taking a look at this, I left quite a few comments.

@yetanothertw
Contributor Author

Thank you so much for reviewing, @dakrone! I've addressed your comments (hopefully correctly) -- that's been very useful.

One thing I think I struggle to understand, and maybe you can help with, is when/under what conditions would a user prefer to rotate their indices using aliases as opposed to data streams? Would that be attributable to legacy reasons mostly or is there a specific problem they would be solving that way? I'm wondering how to provide more context around the when and why one would do that.

The text previously said:

Data streams are designed for append-only data, where the data stream name can be used as the operations (read, write, rollover, shrink etc.) target. If your use case requires data to be updated in place, you can instead manage your time series data using index aliases.

And as you've pointed out, documents can be updated or deleted in the backing index that contains them using the API. So this is not a relevant reason for choosing the aliases approach.

@dakrone
Member

dakrone commented Aug 18, 2025

One thing I think I struggle to understand, and maybe you can help with, is when/under what conditions would a user prefer to rotate their indices using aliases as opposed to data streams? Would that be attributable to legacy reasons mostly or is there a specific problem they would be solving that way?

Certainly, the specific reason that users use aliases and ILM is that they can then avoid rollover. For example, if a user uses time-based indices, they can have an alias called mydata backed by the concrete indices mydata-jan-01, mydata-jan-02, mydata-jan-03, etc. (some users create indices daily, others weekly or monthly). They then index into the correct backing index depending on the timestamp of the document.
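
The timestamp-based routing described here can be sketched as follows. This is only a minimal illustration, assuming daily indices named like mydata-jan-01; the helper name target_index and the fixed English month abbreviations are assumptions made for the example, not part of any Elasticsearch API.

```python
from datetime import datetime, timezone

# Fixed month abbreviations so the result does not depend on the system locale.
_MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
           "jul", "aug", "sep", "oct", "nov", "dec"]


def target_index(alias: str, timestamp: datetime) -> str:
    """Return the concrete index a document with this timestamp belongs in.

    Hypothetical helper: assumes daily indices named <alias>-<mon>-<dd>.
    """
    month = _MONTHS[timestamp.month - 1]
    return f"{alias}-{month}-{timestamp.day:02d}"


# Example document; the field name "@timestamp" is an assumption.
doc = {"@timestamp": datetime(2025, 1, 2, 13, 45, tzinfo=timezone.utc),
       "message": "hello"}
index_name = target_index("mydata", doc["@timestamp"])
print(index_name)
```

Because the index name depends only on the document's own timestamp, sending the same document twice always targets the same concrete index.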

The main reason that users wish to avoid rollover is to solve the problem of duplicate handling with rollover. It's something we plan to address, but it's a real issue for some users.
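
To make the duplicate problem concrete, here is a toy in-memory model (not Elasticsearch code). It assumes that document IDs are unique only within a single index and that writes through a rollover target always land in the current write index, so a document re-sent with the same ID after a rollover is stored a second time. The class ToyCluster and the index names are hypothetical.

```python
from collections import defaultdict


class ToyCluster:
    """In-memory stand-in: each index maps document ID -> document."""

    def __init__(self):
        self.indices = defaultdict(dict)

    def index(self, index_name, doc_id, doc):
        # Within one index, the same ID overwrites the earlier copy.
        self.indices[index_name][doc_id] = doc

    def count(self, doc_id):
        # Number of indices that hold a copy of this ID.
        return sum(1 for docs in self.indices.values() if doc_id in docs)


# Rollover-style writes: the target is whatever the write index is right now.
rolled = ToyCluster()
write_index = "mydata-000001"
rolled.index(write_index, "event-1", {"day": "jan-01"})
write_index = "mydata-000002"  # a rollover happens here
rolled.index(write_index, "event-1", {"day": "jan-01"})  # same event, re-sent
rollover_copies = rolled.count("event-1")

# Timestamp-based writes: the target depends only on the document itself.
dated = ToyCluster()
for _ in range(2):  # original send plus a retry
    dated.index("mydata-jan-01", "event-1", {"day": "jan-01"})
dated_copies = dated.count("event-1")

print(rollover_copies, dated_copies)
```

In this model the rollover-style writes leave two copies of event-1 while the timestamp-based writes leave one, which is the kind of duplication users are trying to avoid.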

@yetanothertw yetanothertw self-assigned this Aug 19, 2025
@yetanothertw yetanothertw marked this pull request as ready for review August 19, 2025 14:45
@yetanothertw yetanothertw requested a review from a team as a code owner August 19, 2025 14:45
@yetanothertw yetanothertw added the documentation Improvements or additions to documentation label Aug 19, 2025
@yetanothertw yetanothertw requested a review from kilfoyle August 19, 2025 14:47

We recommend using [data streams](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-create-data-stream) to manage time series data. Data streams automatically track the write index while keeping configuration to a minimum.
The rollover feature is an important part of how [index lifecycle](../index-lifecycle-management/index-lifecycle.md) (ILM) and [data stream lifecycles](../data-stream.md) (DLM) work to keep your indices fast and manageable. By switching the write target of an index, the rollover action provides the following benefits:
Contributor

@kilfoyle kilfoyle Aug 19, 2025

@dakrone I think this and line 19 are based off of your comment. Do you think it's okay for us to introduce "DLM" here as an acronym for "data stream lifecycle management"? I'm a little hesitant since I gather DLM is already in popular use as "data lifecycle management" (e.g. IBM, HP), and our use here specifically to denote "data stream lifecycle" doesn't quite match. That is, to my mind "ILM" and "data stream lifecycle" are two different approaches to DLM.

What would you think of our sticking with:

  • index lifecycle management (acronym: "ILM")
  • data stream lifecycle (no official acronym)

If we do introduce an acronym for "data stream lifecycle" we should also update this main page about it.

Member

I don't think we can stick with ILM to refer to the data stream's lifecycle, since they are different (ILM has policies, is attached at the index level, has separate APIs, is not available in Serverless, etc).

The challenge that we've run into with "data stream lifecycle" is that absent an acronym, folks have been calling it "DSL", which is a MUCH more confusing acronym than "DLM", so we've been substituting DLM for the name to differentiate it from ILM, but still indicate it is the "Data (in the data stream) Lifecycle Management".

Contributor

Roger that. Thanks for the clarification! DLM it is. :-)

@kilfoyle
Contributor

Very nice work on this @yetanothertw!

@yetanothertw
Contributor Author

Hi @dakrone, when you get the chance, would you mind having another look at this PR and approving it if you're happy that the requested changes were addressed (or requesting new ones)?

Many thanks! 🙏

Contributor

@kilfoyle kilfoyle left a comment

LGTM! 🦖

Labels
documentation Improvements or additions to documentation
Development

Successfully merging this pull request may close these issues.

Data lifecycle docs: “Rollover” improvements
3 participants